Combinatorial Entropy for Distinguishable Entities in Indistinguishable States

The combinatorial basis of entropy by Boltzmann can be written $H = N^{-1} \ln W$, where $H$ is the
dimensionless entropy of a system, per unit entity, $N$ is the number of entities and $W$ is the number
of ways in which a given realization of the system can occur, known as its statistical weight.
Maximizing the entropy ("MaxEnt") of a system, subject to its constraints, is then equivalent to choosing
its most probable ("MaxProb") realization. For a system of distinguishable entities and states, $W$
is given by the multinomial weight, and $H$ asymptotically approaches the Shannon entropy. In
general, however, $W$ need not be multinomial, leading to different entropy measures.

This work examines the allocation of distinguishable entities to non-degenerate or equally degen- 
erate, indistinguishable states. The non-degenerate form converges to the Shannon entropy in some 
circumstances, whilst the degenerate case gives a new entropy measure, a function of a multinomial 
coefficient, coding parameters, and Stirling numbers of the second kind. 

PACS numbers: 02.50.Cw, 02.50.Tt, 05.20.-y, 89.20.-a, 89.70.+c

I. INTRODUCTION

Of the many interpretations of the entropy concept, the
combinatorial (or probabilistic) basis of entropy was given
by Boltzmann [1] and Planck [5] in the famous equation:



$$S_N = N S = k \ln W \qquad (1)$$



where $S_N$ is the total thermodynamic entropy of the system, $S$ is the entropy per unit entity,
$N$ is the number of entities, $W$ is the number of ways in which a specified realization of a
system can occur, known as its statistical weight, and $k$ is the Boltzmann constant. This can be
rewritten to give the dimensionless entropy [H 121 |5j [6]:

$$H = \frac{S}{k} = N^{-1} \ln W \qquad (2)$$

If the weight is of multinomial form, i.e. $W = N! / \prod_{i=1}^{s} n_i!$, where $n_i$ is the number
of entities in the $i$th state from $s$ such states, then in the asymptotic limits $N \to \infty$,
$n_i \to \infty, \forall i$, using the Stirling approximation [7], $\ln m! \approx m \ln m - m$ (or
using Sanov's theorem [8]), the entropy converges to the Shannon function:

$$H = -\sum_{i=1}^{s} p_i \ln p_i \qquad (3)$$

where $p_i = n_i/N$ is the probability of the $i$th state.
However, it must be recognised that a system may not
be of multinomial weight. The best-known examples
are the three distributions examined in quantum physics

[loi nu na usi m] : 



• Degenerate Maxwell-Boltzmann statistics, in which distinguishable
entities are allocated to distinguishable states, with $g_i$ degenerate
sub-states within each state (this reduces to the multinomial case for
$g_i = 1, \forall i$);

• Bose-Einstein statistics, in which indistinguishable
entities are allocated to distinguishable, degenerate
states; and

• Fermi-Dirac statistics, also with indistinguishable entities
allocated to distinguishable, degenerate states, but
with a maximum of one entity per state.

The weights and entropy functions of these statistics are
well known [e.g. TS1 EH HZ]. In such cases, maximisation
of the combinatorial entropy defined by (2) ("MaxEnt"),
subject to the constraints on a system, always yields the
realization of maximum probability ("MaxProb") (or, in
the non-asymptotic case, a distribution close to the
maximum) [SI El E]. This provides a much stronger (purely
probabilistic) definition of the entropy concept than that
given by axiomatic or information-theoretic reasoning.
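
As a numerical illustration of the convergence to the Shannon function (3), the following minimal Python sketch evaluates the non-asymptotic entropy $N^{-1} \ln W$ for the multinomial weight and compares it with $-\sum p_i \ln p_i$ as $N$ grows at fixed proportions. The function names and the proportions (0.5, 0.3, 0.2) are illustrative assumptions and do not come from the original work.

```python
from math import lgamma, log


def multinomial_entropy(counts):
    """H = (1/N) ln W for the multinomial weight W = N! / prod(n_i!)."""
    n_total = sum(counts)
    # lgamma(n + 1) = ln(n!), which avoids forming huge factorials.
    ln_w = lgamma(n_total + 1) - sum(lgamma(n + 1) for n in counts)
    return ln_w / n_total


def shannon_entropy(counts):
    """-sum p_i ln p_i with p_i = n_i / N (empty states contribute zero)."""
    n_total = sum(counts)
    return -sum((n / n_total) * log(n / n_total) for n in counts if n > 0)


# Hypothetical proportions (0.5, 0.3, 0.2), scaled up to larger N.
for scale in (1, 10, 100, 10_000):
    counts = [5 * scale, 3 * scale, 2 * scale]
    print(sum(counts), multinomial_entropy(counts), shannon_entropy(counts))
```

At small $N$ the two values differ noticeably; the gap shrinks as $N$ and every $n_i$ increase, which is the sense in which the multinomial form approaches the Shannon function.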

The aims of this work are (i) to review the concept of
distinguishability, so often used in physics, and (ii)
to derive the statistical weight and entropy of a system
in which distinguishable entities (balls) are allocated to
indistinguishable states (boxes), for both non-degenerate
and equally degenerate cases (§III and §IV). Although this
occupancy problem has a long history [71 [THJ HH1 [5D]
and is included in combinatorial classification schemes
[551 [531 [H], its connection to entropy does not appear to
have been examined previously.



II. ON DISTINGUISHABILITY

The concept of distinguishability strongly affects the
choice of statistic used for analysis. Firstly, the entities
and/or states of a system might be fundamentally
indistinguishable (as is currently believed in quantum
physics); the statistic is thus pre-ordained. A second,
more interesting case is when we choose whether to
distinguish the entities and/or states, based on the purpose
for which the entropy measure will be used. This is
illustrated by the allocation of physicists (entities) to the
seats of a bus (boxes). Four scenarios arise:

• A conference organiser is requested by physicists A, 
B and C for window seats, while X and Y require 
seats near the door; also, everyone is concerned about 
the likely argument between Q and T, should they be 
seated together. In this case, it is necessary to distin- 
guish both the physicists and seats, leading to degen- 
erate multinomial (Maxwell-Boltzmann) statistics (or 
a variant thereof, with a maximum of m physicists per 
seat). 

• The bus company wishes to model the wear and tear 
on its seats. Here they have no interest in distinguish- 
ing the physicists, but need to distinguish the seats. 
This leads to Bose-Einstein statistics (or an intermedi- 
ate variant). 

• Alternatively, the conference organiser does not have 
any seat-specific requests, but is concerned about who 
will sit together. Here the physicists are distinguishable 
but the seats are not, leading to a new type of statistic 
(examined herein). 

• Finally, a more disinterested observer (e.g. a traffic en- 
gineer) does not care who the passengers are, or where 
they sit, but needs to model whether the bus schedule 
is sufficient to meet demand. Here both the physicists 
and seats are indistinguishable. 
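
The effect of each choice can be seen by counting the total number of distinct configurations in the four scenarios above. The sketch below is a hedged illustration using standard combinatorial results rather than formulas from this paper; it assumes non-degenerate seats with no cap on occupancy, and the function names are hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind, via the standard recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


@lru_cache(maxsize=None)
def partitions_at_most(n, s):
    """Number of ways to write n as a sum of at most s positive integers."""
    if n == 0:
        return 1
    if s == 0:
        return 0
    # Either fewer than s parts, or exactly s parts (remove one from each).
    exactly_s = partitions_at_most(n - s, s) if n >= s else 0
    return partitions_at_most(n, s - 1) + exactly_s


def configuration_counts(n, s):
    """Total configurations of n entities in s states for the four cases."""
    return {
        "distinguishable entities, distinguishable states": s ** n,
        "indistinguishable entities, distinguishable states": comb(n + s - 1, n),
        "distinguishable entities, indistinguishable states":
            sum(stirling2(n, k) for k in range(1, s + 1)),
        "indistinguishable entities, indistinguishable states":
            partitions_at_most(n, s),
    }


for case, count in configuration_counts(5, 3).items():
    print(f"{case}: {count}")
```

The same five physicists and three seats therefore span very different numbers of configurations, depending only on what the observer chooses to distinguish.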

Such considerations lead naturally to the "subjective" 
(or "observer-dependent") view of the entropy concept, 
a viewpoint vigorously defended by Jaynes [5T] [c.f. |3J|B].

FIG. 1: Allocation of distinguishable balls to indistinguishable boxes.

TABLE I: Stirling numbers of the second kind $\left\{ {N \atop k} \right\}$.

in which the zeroes (unfilled states) extend from $n_{k+1}$ to
$n_s$. Of course, some Stirling numbers $\left\{ {N \atop k} \right\}$ permit only
one realization, e.g.:



III. THE NON-DEGENERATE CASE 

We now consider the number of ways in which $N$ distinguishable
balls can be allocated to $s$ non-degenerate,
indistinguishable boxes, to give the realization $\{n_i\}$ of
numbers of balls in each box (the boxes being unlabelled),
as shown in Figure 1. This statistical weight can be
denoted $W_{D:I} = \{\{n_1, n_2, \dots, n_s\}\}$, with $\sum_{i=1}^{s} n_i = N$. It is
known [25] that the number of ways to arrange $N$ distinguishable
balls in $k$ non-empty indistinguishable boxes
(for $k \le s$) is given by the Stirling number of the second
kind $\left\{ {N \atop k} \right\}$, the first few values of which are listed in
Table I. These satisfy the recurrence relation [25]:

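$$\left\{ {N \atop k} \right\} = k \left\{ {N-1 \atop k} \right\} + \left\{ {N-1 \atop k-1} \right\}, \qquad \left\{ {N \atop 1} \right\} = \left\{ {N \atop N} \right\} = 1 \qquad (4)$$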
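
The recurrence can be used directly to tabulate the first few Stirling numbers of the second kind. The following is a minimal sketch, assuming the standard form $\{N, k\} = k\{N-1, k\} + \{N-1, k-1\}$ with $\{N, N\} = 1$; the function names are illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind {n k} from the recurrence."""
    if n == k:
        return 1          # includes {0 0} = 1 and {N N} = 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def stirling_table(n_max):
    """Rows N = 1..n_max, each listing {N k} for k = 1..N."""
    return [[stirling2(n, k) for k in range(1, n + 1)]
            for n in range(1, n_max + 1)]


for n, row in enumerate(stirling_table(6), start=1):
    print(n, row)
```

Each printed row lists $\{N, 1\}, \dots, \{N, N\}$ for one value of $N$.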



By combinatorial enumeration, it is readily determined
that $\left\{ {N \atop k} \right\}$ does not give $W_{D:I}$; e.g. $\{\{3,1,1\}\} = 10$ and
$\{\{2,2,1\}\} = 15$; it is their sum which gives the Stirling
number $\left\{ {5 \atop 3} \right\} = 25$. By definition, this gives the general
result:

$$\left\{ {N \atop k} \right\} = \sum_{\{n_i\}:\ k \text{ filled},\ \sum_i n_i = N} \{\{n_1, n_2, \dots, n_k, 0, \dots, 0\}\} \qquad (5)$$

$$\left\{ {N \atop 1} \right\} = \{\{N, 0, \dots, 0\}\} = 1$$

$$\left\{ {N \atop N} \right\} = \{\{\underbrace{1, 1, \dots, 1}_{N \text{ times}}\}\} = 1 \qquad (8)$$



What can we say about $W_{D:I}$? Firstly, it is unaffected
by any zeroes amongst the $n_i$, since we could
arbitrarily add unfilled states to (5), without change.
Secondly, as the states are unlabelled, it is meaningless to
permute the $n_i$; e.g. $\{\{2,2,1\}\}$ and $\{\{2,1,2\}\}$ refer to the
same realization $\{2,2,1\}$. This is quite different to the
multinomial weight; e.g. $(2,2,1)$ and $(2,1,2)$ are
numerically equal, but represent different realizations $[2,2,1]$
and $[2,1,2]$. The D:I statistic thus has fewer realizations
than the multinomial case.

It can in fact be shown that:

$$W_{D:I} = \frac{N!}{\prod_{i=1}^{k} n_i! \, \prod_{j=1}^{N} r_j!} \qquad (9)$$



where $r_j$ is the number of occurrences of integer $j$ in
the set $\{n_i\}$, or its repetitivity (note that the zeros are not
counted), whence $\sum_{j=1}^{N} r_j = k$. Proof of (9) considers
the successive filling of boxes: $W_{D:I}$ must equal the number
of ways to choose $n_1$ balls from $N$ balls, multiplied
by the number of ways to choose $n_2$ balls from $N - n_1$
balls, and so on, for $k$ boxes; the product must then
be divided by the number of ways that each multiply-occurring
integer $j$ can occur in the set $\{n_i\}$, given by
$r_j!$, to account for the indistinguishable boxes. For example,
$\{\{2,2,1\}\} = \binom{5}{2}\binom{3}{2}\binom{1}{1} / (2!\,1!) = 15$; or, since the order
of filling is immaterial, $\{\{2,2,1\}\} = \binom{5}{1}\binom{4}{2}\binom{2}{2} / (1!\,2!) = 15$.
Generalising this result:

$$W_{D:I} = \frac{1}{\prod_{j=1}^{N} r_j!} \prod_{i=1}^{k} \binom{N - \sum_{m=1}^{i-1} n_m}{n_i} = \frac{1}{\prod_{j=1}^{N} r_j!} \prod_{i=1}^{s} \binom{N - \sum_{m=1}^{i-1} n_m}{n_i} \qquad (10)$$

The product of binomial coefficients in (10) is simply the
multinomial coefficient, giving (9); the last form in (10)
can be used when $k$ is not known in advance. □
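
The weight can be checked by brute force for small $N$. The sketch below assumes the form described in the proof, namely the multinomial coefficient divided by $r_j!$ for each repeated occupancy $j$, and compares it with a direct enumeration of all partitions of a set of $N$ labelled balls into unlabelled blocks. The function names are illustrative.

```python
from collections import Counter
from math import factorial


def weight_di(counts):
    """W_{D:I} = N! / (prod_i n_i! * prod_j r_j!), ignoring empty boxes."""
    filled = [n for n in counts if n > 0]
    w = factorial(sum(filled))
    for n in filled:
        w //= factorial(n)
    # r_j = number of boxes holding exactly j balls (the repetitivity).
    for r in Counter(filled).values():
        w //= factorial(r)
    return w


def set_partitions(items):
    """Yield every partition of a list into unlabelled non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or let it start a new block.
        yield [[first]] + partition


def enumerate_weights(n_balls):
    """Count set partitions of n_balls labelled balls by occupancy pattern."""
    tally = Counter()
    for partition in set_partitions(list(range(n_balls))):
        pattern = tuple(sorted((len(block) for block in partition), reverse=True))
        tally[pattern] += 1
    return tally


for pattern, count in sorted(enumerate_weights(5).items()):
    print(pattern, count, weight_di(pattern))
```

For $N = 5$ the enumeration gives 15 for the pattern (2,2,1) and 10 for (3,1,1), matching the values quoted in the text.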

The weight has been given previously [T51 [HO HH] in
the form:

$$W_{D:I} = (N; r_1, r_2, \dots, r_N)' = \frac{N!}{\prod_{j=1}^{N} (j!)^{r_j} \, r_j!} \qquad (11)$$



Recognising each $j$ term in (11) as one of the $n_i$ terms,
this reduces to (9). It is, however, much less useful for
the derivation of an entropy function.

Also needed is the sum of the weights, given by an
incomplete Bell number [20]:

$$\sum W_{D:I} = B(N,s) = \sum_{k=1}^{s} \left\{ {N \atop k} \right\} \qquad (12)$$

This reduces to the usual Bell number $B_N$ for $s \ge N$.

The non-asymptotic entropy, denoted $H_{D:I}$, can now be
calculated using (2) and (9):

$$H_{D:I} = -\frac{1}{N} \left\{ \sum_{i=1}^{k} \left[ \ln n_i! - \frac{n_i}{N} \ln N! \right] + \sum_{j=1}^{N} \ln r_j! \right\} \qquad (13)$$

where the $\ln N!$ term is brought inside the first sum
using $\sum_{i=1}^{k} n_i = N$. Application of the "traditional"
asymptotic limits $N \to \infty$, $n_i \to \infty, \forall i$, using the Stirling
approximation, as well as the corresponding limits
$r_{j \ne \infty} \to 0$, $r_\infty = k$, then gives, for $k \nrightarrow \infty$:

$$H_{D:I} = -\sum_{i=1}^{k} p_i \ln p_i - \lim_{N \to \infty} \left( \frac{1}{N} \ln k! \right) = -\sum_{i=1}^{k} p_i \ln p_i \qquad (14)$$

$H_{D:I}$ therefore converges to the Shannon entropy (3) in
the traditional limits. However, this is not the full
picture, as shown by the following examples.
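
The traditional limit can be probed numerically before turning to those examples. The sketch below assumes that the non-asymptotic entropy is $N^{-1} \ln W_{D:I}$, with $W_{D:I}$ the multinomial coefficient divided by the $r_j!$ terms, and evaluates it for three equally filled boxes as the occupancy grows. Function names and the choice of three boxes are illustrative assumptions.

```python
from collections import Counter
from math import lgamma, log


def entropy_di(counts):
    """H_{D:I} = (1/N) [ln N! - sum_i ln n_i! - sum_j ln r_j!]."""
    filled = [n for n in counts if n > 0]
    n_total = sum(filled)
    ln_w = lgamma(n_total + 1)
    ln_w -= sum(lgamma(n + 1) for n in filled)
    # Repetitivity term: r_j boxes share the same occupancy j.
    ln_w -= sum(lgamma(r + 1) for r in Counter(filled).values())
    return ln_w / n_total


def shannon_entropy(counts):
    """-sum p_i ln p_i with p_i = n_i / N."""
    n_total = sum(counts)
    return -sum((n / n_total) * log(n / n_total) for n in counts if n > 0)


for occupancy in (1, 10, 100, 1000):
    counts = [occupancy] * 3
    print(counts, entropy_di(counts), shannon_entropy(counts))
```

With one ball per box the D:I entropy is zero while the Shannon value is $\ln 3$; only when every box is heavily occupied do the two agree, since the $\ln k!$ correction is then negligible compared with $N$.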

Examples: The most probable realization(s) for both
multinomial and D:I statistics, determined by enumerating
all realizations, are listed for several values of $N$
in Tables II and III for two situations: (i) $N = s$, and (ii)
$s = 3$. To summarise:

• As expected, the most probable realization of the 
multinomial statistic, subject only to the natural con- 
straint, corresponds in all cases to the uniform dis- 
tribution [N/s, . . . ,N/s], or, if unable to achieve this 
(due to quantisation of the balls or boxes), to a set 
of equiprobable local maxima centred on the uniform 
distribution. 

• In contrast, the D:I statistic for $N = s$ gives a step
function ("staircase") with many unfilled boxes. The
number of filled states $k$, the number of steps and the
width of each step all increase with $N$, but the number
of unfilled states $(s - k)$ increases even more rapidly.
This preferential bunching ("cohesion") of the filled
states is very different to their preferential spreading
in the multinomial case. In consequence, applying the
"traditional" asymptotic limit $n_i \to \infty, \forall i$ (in addition
to $N \to \infty$) is inappropriate for $k < s$, since it does
not account for the stepped nature of the distribution
(with $n_i \ll \infty$ in many states), and also distorts the
role of the $r_j$. Thus for $k < s$, the asymptotic limit
in (14) does not apply. On the other hand, for $s = 3$
(an example of $N \gg s$), the most probable realizations
are again non-uniform, but all boxes are filled
(except for $N < 3$). As evident from the table, the
filling will become more uniform as $N \to \infty$, which
will be consistent with $n_i \to \infty, \forall i$; this then gives the
asymptotic limit of (14).

The D:I statistic thus has very different convergence
properties to the multinomial, being more strongly
dependent on small values of $n_i$; in the limit $N \to \infty$,
it asymptotically approaches the Shannon entropy for
$N \gg s$ and $k = s$. Outside of these bounds, more
detailed analysis is needed to identify any asymptotic
limits; until then, the non-asymptotic entropy $H_{D:I}$ (13)
must be used.
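
The enumeration behind these examples can be sketched as follows. This is a minimal illustration, not the author's procedure: it lists every occupancy pattern of $N$ balls in at most $s$ boxes, evaluates the multinomial and D:I weights (the latter assumed to be the multinomial coefficient divided by the $r_j!$ terms), and reports the pattern(s) of maximum weight. For the multinomial statistic a pattern stands for all of its equally weighted labelled permutations. All function names are hypothetical.

```python
from collections import Counter
from math import factorial


def partitions(n, max_parts, max_part=None):
    """Yield non-increasing tuples of positive integers summing to n,
    with at most max_parts entries."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, max_parts - 1, first):
            yield (first,) + rest


def weight_multinomial(pattern):
    """N! / prod_i n_i! for one labelled arrangement of the pattern."""
    w = factorial(sum(pattern))
    for n in pattern:
        w //= factorial(n)
    return w


def weight_di(pattern):
    """Multinomial weight divided by r_j! for each repeated occupancy."""
    w = weight_multinomial(pattern)
    for r in Counter(pattern).values():
        w //= factorial(r)
    return w


def most_probable(n, s, weight):
    """Return (patterns of maximum weight, that weight)."""
    best, best_w = [], 0
    for pattern in partitions(n, s):
        w = weight(pattern)
        if w > best_w:
            best, best_w = [pattern], w
        elif w == best_w:
            best.append(pattern)
    return best, best_w


def total_weight_di(n, s):
    """Sum of D:I weights over all patterns (the incomplete Bell number)."""
    return sum(weight_di(p) for p in partitions(n, s))


for n in (5, 10, 15):
    patterns, w = most_probable(n, n, weight_di)
    print(n, patterns, w / total_weight_di(n, n))
```

For $N = s = 5$ the D:I maximum is the pattern (2,2,1), leaving two boxes empty, whereas the multinomial maximum is the uniform pattern (1,1,1,1,1); for $N = 6$, $s = 3$ the D:I maximum is (3,2,1), which fills every box but is not uniform.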



IV. THE EQUALLY DEGENERATE CASE 

We now consider a simple degenerate form of the D:I
statistic, in which each indistinguishable state contains
$g$ indistinguishable sub-states, with $n_{im}$ entities in the
$m$th sub-state of state $i$, whence $\sum_{m=1}^{g} n_{im} = n_i$. The weight can be
denoted:



W ft ; (9 ) = {{ ni 



N 
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(15) 



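Using the reasoning of §III, the weight of each realization
is given by the weight (9) of filling of the states,
multiplied by the number of ways of filling within each state;
each component of the latter must contain a sum over the
possible number of filled sub-states $\gamma = 1, \dots, \min(n_i, g)$.
This gives, for fixed $k$:

$$W_{D:I}^{(g)} = \frac{N!}{\prod_{i=1}^{k} n_i! \prod_{j=1}^{N} r_j!} \prod_{i=1}^{k} \sum_{\gamma=1}^{\min(n_i,g)} \sum_{\{n_{im}\}} \frac{n_i!}{\prod_{m=1}^{\gamma} n_{im}! \prod_{l=1}^{n_i} r_{il}!} \qquad (16)$$

where $r_{il}$ is the repetitivity of $l$ in the set $\{n_{im}\}$. Using
(5) and (12), this simplifies to:

$$W_{D:I}^{(g)} = \frac{N!}{\prod_{i=1}^{k} n_i! \prod_{j=1}^{N} r_j!} \prod_{i=1}^{k} \sum_{\gamma=1}^{\min(n_i,g)} \left\{ {n_i \atop \gamma} \right\} = \frac{N!}{\prod_{i=1}^{k} n_i! \prod_{j=1}^{N} r_j!} \prod_{i=1}^{k} B(n_i, g) \qquad (17)$$

For non-degenerate states, $\left\{ {n_i \atop 1} \right\} = 1, \forall i$, hence (17)
reduces to (9). As with the non-degenerate case, the products
over $i$ in (17) can be extended to $s$ instead of $k$, since
for unfilled states we can take $\sum_{\gamma} \left\{ {0 \atop \gamma} \right\} = \left\{ {0 \atop 0} \right\} = 1$, or
alternatively $B(0,0) = 1$.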
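
The structure of the degenerate weight can be illustrated with a short sketch. It assumes the product form described above: the non-degenerate weight (9), multiplied for each state by the sum of Stirling numbers $\{n_i, \gamma\}$ over $\gamma = 1, \dots, \min(n_i, g)$, with the factor for an unfilled state taken as 1. The function names are illustrative.

```python
from collections import Counter
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind from the recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def incomplete_bell(n, g):
    """Sum of {n gamma} for gamma = 1..min(n, g); taken as 1 when n = 0."""
    if n == 0:
        return 1
    return sum(stirling2(n, gamma) for gamma in range(1, min(n, g) + 1))


def weight_di(counts):
    """Non-degenerate weight: N! / (prod_i n_i! * prod_j r_j!)."""
    filled = [n for n in counts if n > 0]
    w = factorial(sum(filled))
    for n in filled:
        w //= factorial(n)
    for r in Counter(filled).values():
        w //= factorial(r)
    return w


def weight_di_degenerate(counts, g):
    """Assumed degenerate weight: weight_di times one factor per state."""
    w = weight_di(counts)
    for n in counts:
        w *= incomplete_bell(n, g)
    return w


for g in (1, 2, 3):
    print(g, weight_di_degenerate((2, 2, 1), g))
```

With $g = 1$ every per-state factor is 1 and the non-degenerate weight is recovered; appending unfilled states to the realization leaves the result unchanged.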

The non-asymptotic entropy for the simple degenerate
case is, from (2) and (17):

$$H_{D:I}^{(g)} = \frac{1}{N} \left\{ \ln N! - \sum_{i=1}^{k} \ln n_i! - \sum_{j=1}^{N} \ln r_j! + \sum_{i=1}^{k} \ln \sum_{\gamma=1}^{\min(n_i,g)} \left\{ {n_i \atop \gamma} \right\} \right\} \qquad (18)$$

In the Stirling-approximate limits $N \to \infty$, $n_i \to \infty, \forall i$,
for which $r_{j \ne \infty} = 0$ and $r_\infty = k$, we can see (e.g. from
Table I) that each sum over $\gamma$ in (18) will be dominated
by its largest term $\left\{ {n_i \atop \gamma_i^\#} \right\}$, where $1 < \gamma_i^\# \ll n_i$. Applying
the Jordan limit $\left\{ {n \atop a} \right\} \sim a^n / a!$ as $n \to \infty$ [25] then gives:

This closely resembles the degenerate Maxwell-Boltzmann
entropy $H_{MB} = -\sum_{i=1}^{s} p_i \ln (p_i / g_i)$, where
$g_i$ is the degeneracy of state $i$ [THl [THJ HZ]. However, as
shown in §III, this form does not reflect the behaviour
of this statistic when unfilled states or sub-states are
present, for which $H_{D:I}^{(g)}$ (18) must be used.
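
The Jordan limit used above can be checked numerically. The following minimal sketch computes Stirling numbers of the second kind by iterating the recurrence and forms the ratio of $\{n, a\}$ to $a^n / a!$; the function names and the sample values of $n$ and $a$ are illustrative assumptions.

```python
from math import factorial


def stirling2_row(n):
    """Return [{n 0}, {n 1}, ..., {n n}] by iterating the recurrence."""
    row = [1]  # row for n = 0: {0 0} = 1
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            above = row[k] if k < m else 0  # {m-1 k}, zero when k = m
            new[k] = k * above + row[k - 1]
        row = new
    return row


def jordan_ratio(n, a):
    """Ratio {n a} / (a^n / a!), which should approach 1 as n grows."""
    return stirling2_row(n)[a] * factorial(a) / a ** n


for n in (5, 10, 20, 50):
    print(n, jordan_ratio(n, 3))
```

The ratio approaches 1 from below as $n$ increases at fixed $a$, so the approximation is poor when $n$ is comparable to $a$, which is consistent with the caution about unfilled or sparsely filled sub-states.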






V. CONCLUSIONS 

The statistical weight for the allocation of distinguishable
entities to indistinguishable states is derived herein,
for both non-degenerate and equally degenerate states.
The weight is obtained as a function of a multinomial 
coefficient, a set of coding parameters, and (for the de- 
generate case) a set of Stirling numbers of the second 
kind or of incomplete Bell numbers. Using Boltzmann's 
combinatorial definition (the "Boltzmann principle" ) , the 
non-asymptotic entropy functions are then obtained. For 
fully filled states, the non-degenerate and degenerate en- 
tropies converge respectively to the Shannon and degen- 
erate Maxwell-Boltzmann functions, but not otherwise. 

This study illustrates the importance of the combi- 
natorial definition of entropy, for which the maximum 
entropy position ("MaxEnt") gives the most-probable 
( "MaxProb" ) realization of the system (or, in the non- 
asymptotic case, a distribution close to the maximum). 
For systems which follow the D:I statistic, blind application
of MaxEnt based on the Shannon entropy (3)
will give the most probable realization only in special
circumstances. Given the long history of Bose-Einstein
and Fermi-Dirac statistics in physics, for the allocation
of indistinguishable entities to distinguishable states
[nmmiiaEaHinainain], it is surprising that the
entropy functions for the opposite occupancy problem do
not appear to have been examined previously.

TABLE II: Most probable realizations for the multinomial statistic, using $W = W_{mult}$ and $\sum W = s^N$.

TABLE III: Most probable realizations for the D:I statistic, using $W = W_{D:I}$ (9) and $\sum W = B(N,s)$ (12).